-
Notifications
You must be signed in to change notification settings - Fork 9
Evaluation: Refactor cron #594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThe evaluation cron processing system is refactored from organization-centric grouping to project-centric grouping. The polling function signature simplifies by removing the organization ID parameter, processing all pending evaluation runs across all organizations and grouping them by project. Response structures are updated to reflect per-run details instead of per-organization summaries. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant CronRoute as Cron Route<br/>(cron.py)
participant ProcessCron as Process Cron<br/>(cron.py)
participant PollFunc as Poll Function<br/>(processing.py)
participant Database as Database<br/>Session
participant ProjectGroup as Project<br/>Grouping
Client->>CronRoute: GET /cron/evaluation_jobs
CronRoute->>ProcessCron: process_all_pending_evaluations_sync(session)
ProcessCron->>PollFunc: await poll_all_pending_evaluations(session)
PollFunc->>Database: Fetch all pending evaluation runs
Database-->>PollFunc: List of pending runs
PollFunc->>ProjectGroup: Group runs by project_id
ProjectGroup-->>PollFunc: Dict[project_id, List[runs]]
loop For each project
PollFunc->>PollFunc: Initialize OpenAI/Langfuse<br/>clients per project
PollFunc->>PollFunc: Process all runs<br/>in project
PollFunc->>Database: Update run statuses/results
end
PollFunc-->>ProcessCron: Summary dict<br/>(total_processed, total_failed,<br/>total_still_processing, details)
ProcessCron-->>CronRoute: Processed response
CronRoute-->>Client: Response with run-level<br/>details (no org grouping)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
backend/app/crud/evaluations/processing.py (1)
750-755:⚠️ Potential issue | 🟡 Minor
"embeddings_completed"and"embeddings_failed"are counted asstill_processing.
check_and_process_evaluationcan return actions"embeddings_completed"and"embeddings_failed"(lines 515, 536), both of which leave the eval run in"completed"status. Because they don't match"processed"or"failed", they fall into theelsebranch and incrementtotal_still_processing_count, which is misleading in the summary.Proposed fix
if result["action"] == "processed": total_processed_count += 1 - elif result["action"] == "failed": + elif result["action"] in ("failed", "embeddings_failed"): total_failed_count += 1 + elif result["action"] == "embeddings_completed": + total_processed_count += 1 else: total_still_processing_count += 1backend/app/api/routes/cron.py (1)
54-60:⚠️ Potential issue | 🟡 MinorError response is missing
"results"key, inconsistent with the success shape.The success path (line 47) returns a dict that always contains
"results"(set byprocess_all_pending_evaluationsincron.py). The route-level error handler here omits it, which may break callers expecting a uniform schema.Proposed fix
return { "status": "error", "error": str(e), "total_processed": 0, "total_failed": 0, "total_still_processing": 0, + "results": [], }
🧹 Nitpick comments (3)
backend/app/crud/evaluations/processing.py (1)
699-701: Derivingorg_idfrom the first run is safe given the data model, but fragile if the assumption changes.
org_id = project_runs[0].organization_idrelies on all runs sharing the same org for a given project. This holds becauseProjectbelongs to a singleOrganization, but consider adding a brief comment explaining why this is safe (the FK relationship), so future readers don't question it.backend/app/crud/evaluations/cron.py (1)
66-78:asyncio.run()may conflict with an existing event loop.
asyncio.run()creates a new event loop and fails withRuntimeErrorif one is already running. While FastAPI runs sync endpoints in a threadpool (no active loop there), this is fragile — if the endpoint is ever changed toasync def, or if this wrapper is called from any async context, it will break. Consider making the route handlerasync defand awaitingprocess_all_pending_evaluationsdirectly.Alternative: make the endpoint async and drop the sync wrapper
In
backend/app/api/routes/cron.py:-def evaluation_cron_job( +async def evaluation_cron_job( session: SessionDep, ) -> dict:- result = process_all_pending_evaluations_sync(session=session) + result = await process_all_pending_evaluations(session=session)Then this sync wrapper can be removed entirely.
backend/app/api/routes/cron.py (1)
19-21: Return type hint could be more specific.Per coding guidelines, all return values should have type hints.
-> dictcould be-> dict[str, Any]for consistency with the rest of the codebase. As per coding guidelines, "Always add type hints to all function parameters and return values in Python code".Proposed fix
-def evaluation_cron_job( - session: SessionDep, -) -> dict: +def evaluation_cron_job( + session: SessionDep, +) -> dict[str, Any]:You'll also need to add
from typing import Anyto the imports.
Summary
Target issue is #493
Checklist
Before submitting a pull request, please ensure that you mark these task.
fastapi run --reload app/main.pyordocker compose upin the repository root and test.Notes